- Path: grafix.xs4all.nl!john.hendrikx
- Date: Tue, 27 Feb 96 12:17:18 GMT+1
- Newsgroups: comp.sys.amiga.programmer
- Distribution: world
- Subject: Re: Amiga doesn`t need Pl
- MIME-Version: 1.0
- Content-Type: text/plain; charset=iso-8859-1
- Content-Transfer-Encoding: 8bit
- From: john.hendrikx@grafix.xs4all.nl (John Hendrikx)
- Message-ID: <john.hendrikx.4hkq@grafix.xs4all.nl>
- Organization: Private
-
- In a message of 24 Feb 96 Stephan Schaem wrote to All:
-
- >> John Hendrikx (john.hendrikx@grafix.xs4all.nl) wrote: The 'copying' loop
- >> simply doesn't exist on the clones, just paste it wherever you want it
- >> in the gfx-buffer and be done with it. This is likely to be as fast as
- >> a DOOM clone on the A1200 but with the C2P pass disabled (ie, no
- >> display).
-
- SS> By paste you mean copy from local mem to gfx card mem.... like fastmem
-
- That's a possibility, but on the clones you could even get away with writing
- each byte directly to the gfx-card (if this is slow, then meanwhile the CPU can
- continue to process the next pixel).
-
- SS> to chipmem. C2P is a small factor added to the copy, mainly because the
- SS> problem is really slow 'gfx card'/chip mem on the amiga.
-
- Not all gfx-cards on Amiga are slow, but you're gonna need to have a Z-III card
- to get good performance. It's only because ChipRAM is so slow that we can get
- C2P for 'free'.
-
- >> There is no Amiga with the same power as the P133. And my 'average' (and
- >> 2 year old $100) VLB Gfx card handles 15 MB/sec easily, more than enough
- >> to do 640x480 in 2 frames.
-
- SS> I would like to compare the latest gouraud/tmap loop on 68060 VS P5
- SS> running all in the L1 cache. My guess is that it takes more than 2x
- SS> the mhz for the pentium to get the same number of pixels rendered.
- SS> But out of the chip, expensive l2 cache and boosted mhz make a
- SS> difference in the overall performance.
-
- And you forget to mention that the Pentium has 64-bit memory access.
-
- SS> The cpu is not the problem, the amiga HW (From CBM) sux big time.
-
- I never said the 060 was the problem. For most Amigas, however, even a 68060
- won't make a whole lot of difference unless you have a gfx-card (and
- preferably a Z-III one).
-
- >> There is no (good) clone because it requires an 040 + fast Chunky
- >> gfx-card, period. Caused of course by the fact that 040 + fast Chunky
- >> gfx-card is a rare combination in the Amiga world.
-
- SS> The CPU is not the problem... and 030 can render doom in 'fastmem'
- SS> 'easy'.
-
- I don't think so, maybe the 50 MHz version, but they still won't get 25 FPS
- or so at 320x256 1x1 (talking DOOM here, not some Wolfenstein clone with floors
- which I see all too often).
-
- SS> The killing factor is the slow video memory. An 030 competes easily with
- SS> a 486 in integer operations (mhz for mhz, not on an inst basis but on
- SS> an overall small cached loop... like tmap)
-
- Do you really think so? On the 030 the fastest instructions available take 2
- cycles, while most instructions on the 486 take 1 cycle. The 486 also has much
- faster Mul and Div instructions. You would be better off comparing the 040
- with the 486.
-
- >> Yes it does, see TextDemo. The percentage of CPU time used for the C2P
- >> is NON-EXISTENT on the clones, because the 'fast-ram buffer' we use on
- >> Amiga is called 'the screen' on the clones. No extra copying (or
- >> converting for that matter) needed.
-
- SS> Again the problem is not c2p but slow video memory...Does PC always
- SS> cache video memory on the L1 cache? I hear of many people rendering in
- SS> local mem then doing a copy.
-
- Of course the video memory is not cached in the L1 cache, for the same reason
- as ChipRAM isn't cached on Amiga. To copy the stuff to video RAM, why not
- simply ask the DMA controller to copy that shit for you while you render the
- next frame? Also, why wouldn't the same trick used to get 'free' cycles on
- Amiga during ChipRAM writes work with the clones' much faster video RAM?
- While the pixel is being written to video RAM, the processor continues to
- calculate the next TMapped pixel.
-
- >> That's TextDemo 5.7x (unreleased version) someone tested for me. 15-20
- >> FPS for a 68060/50 which is supposed to be 2-3 times as powerful as a
- >> 486DX2/50 is quite depressing, considering that that 486 will do it at 30
- >> FPS. Now just translate that to the slower Amigas (ie, the ones only
- >> equipped with 030's and 040's).
-
- SS> 1.2 meg, around 15 frames a second used to copy the fastmem buffer to
- SS> chip. So (100mips*75%) / (320*200*20) = 58.5 cycles per pixel rendered
- SS> in 060 local mem! That is HUGE when you know that an 040 needs ~10
- SS> cycles per pixel to do floor/ceiling gouraud shaded texture mapping.
-
- I doubt this 10 cycle routine of yours is very useful for realistic purposes,
- judging from all the 'unrealistic' TMap routines I've seen here lately (ones
- which rely on 64K boundaries or on textures that are too big or too small).
-
- The routine used to do (plain shaded) wall-mapping in TD takes 18 cycles/pixel
- (030 cycles). The floor/ceiling mapper is not the best possible anymore (I've
- seen a *useful* trick presented here recently which I could have used in the
- floor/ceiling mapper).
-
- SS> why would a 50mhz 060 be 6 times slower than a 40mhz 040 when working
- SS> only in local fastmem?!?!!!?!??!!? (I assume here that you do gouraud
- SS> shade your quads)
-
- It DIDN'T work in local fastmem (did I say that?). This included C2P time. It
- was run in 320x240 1x1, 8-bit, full floors, ceilings and walls in DOOM style.
-
- Grtz John
-
- -----------------------------------------------------------------------
- John.Hendrikx@grafix.xs4all.nl TextDemo/FastView/Etc... development
- -----------------------------------------------------------------------
- -- Via Xenolink 1.985B5, XenolinkUUCP 1.1
-